Summary

Problem.  During  combat,  documentation  of  medical  treatment  information  is  critical  for 
maintaining  continuity  of  patient  care.  However,  knowledge  of  the  prior  status  and 
treatment  of  patients  is  limited  to  the  information  noted  on  a  paper  Field  Medical  Card 
(FMC).  MEDTAG,  an  electronic  hand-held  field  medical  documentation  device,  is 
designed to write and store an individual’s medical data to a smart card (the
Multi-technology Automated Reader Card [MARC]). The MEDTAG’s two-button data entry
method  has  been  shown  to  document  more  information  more  quickly  than  the  paper 
FMC.  Recently,  considerable  interest  in  voice  data  entry  methods  has  been  shown.  This 
interest  in  speech  recognition  technology  for  documenting  battlefield  medical  care  is 
motivated  by  the  need  to  gather  information  quickly  and  accurately  in  an  environment 
where  the  corpsman’s  eyes  and  hands  are  busy  delivering  medical  care.  It  is  hoped  that 
this  “multitasking”  will  maximize  the  time  available  for  clinical  care.  While  the 
technology  for  recognizing  natural  speech  is  advancing  rapidly,  a  huge  gap  still  exists 
between  human  speech  recognized  by  the  human  ear  and  speech  recognizable  by  a 
computer. Nevertheless, speech recognition technology has reached a level where, if
applications are chosen appropriately, it can provide a means of communication
between humans and computers that, although not error-free, is approaching the
acceptable range. Research suggests that care must be taken when evaluating the utility
of  human-machine  voice  communication  for  new  applications. 

Objective.  The  objective  of  this  study  was  to  evaluate  the  speed  and  accuracy  of  three 
data  entry  methods  (keyboard,  two-button,  and  voice)  for  documenting  casualty  care.  In 
addition,  perceptions  of  corpsmen  regarding  the  ease  of  learning  and  using  these  input 
methods  were  gathered. 

Approach.  For  this  study,  a  desktop  computer  was  configured  to  simulate  the  operation 
of  the  MEDTAG.  The  MEDTAG  software,  developed  by  the  Naval  Health  Research 
Center,  was  used.  This  MEDTAG  simulator,  called  MEDSIM,  was  used  as  the  platform 
for  evaluating  the  speed  and  accuracy  of  entering  medical  treatment  data  using  a  standard 
keyboard,  the  two-button  MEDTAG  model,  and  a  speech  recognition  system.  To 
evaluate  the  speed  and  accuracy  of  medical  treatment  data  entry,  corpsmen  were  trained 
and then instructed to document injury, treatment, patient condition, and disposition data
for  two  hypothetical  patients.  Measurements  of  time  and  accuracy  to  enter  these  data 
were  gathered  for  each  input  method.  The  experimental  design  was  a  one-way,  repeated- 
measures  design  using  the  three  methods  of  data  entry  as  the  variable  of  interest. 

Results.  Results  showed  that  the  MEDTAG  two-button  entry  method  for  documenting 
casualty  care  was  the  fastest,  followed  by  the  keyboard  and  the  voice  data  entry  methods, 
respectively.  The  two-button  method  was  8%  faster  than  voice  data  entry.  Fewer 
content  errors  were  made  using  the  speech  recognition  method  compared  with  the 
keyboard  and  the  two-button,  but  the  differences  were  not  significant.  Significantly 
fewer  scrolling  errors  were  made  using  the  voice  method  than  using  the  other  two 
methods.  In  general,  corpsmen  reported  that  keyboard  and  speech  were  easiest  to  learn 
and  to  use  for  inputting  data.  They  reported  that  the  keyboard  and  the  two-button  method 
took  less  time  compared  with  the  FMC,  and  that  mistakes  were  more  likely  to  occur  using 
the  two-button  method.  When  asked  which  method  they  would  prefer  to  use,  which 
would  work  best  in  combat,  and  which  would  most  improve  field  medical  care,  they  chose 
the two-button method most frequently. Finally, when asked which method allowed them
freedom  of  their  hands  and  interfered  least  with  their  duties,  they  chose  the  voice  input 
method  most  frequently. 

Discussion.  In  general,  the  speech  recognition  method  was  found  to  be  slower,  yet 
somewhat  more  accurate,  than  either  the  keyboard  or  the  two-button  method.  Further, 
users  reported  a  preference  for  the  two-button  method.  These  results  must  be  viewed 
with  the  understanding  that  the  subjects  were  novices  with  respect  to  voice  input,  but 
were  very  experienced  with  keyboard  input.  The  novelty  of  the  speech  recognition  system 
could  account  for  these  findings.  Viewed  in  this  light,  voice  holds  much  promise  as  a 
mode  of  input  for  medical  documentation.  Future  work  will  focus  on  expanding  the 
vocabulary  available  to  the  users  for  documentation,  thus  making  the  interaction  more 
consistent  with  the  way  they  actually  perform  their  jobs. 
Introduction


Historically,  the  Field  Medical  Card  (FMC)  has  been  used  to  provide  the  clinical 
documentation  that  moves  with  a  casualty  during  evacuation.  The  Department  of  Defense 
(DoD) Medical Readiness Strategic Plan (MRSP), promulgated in February 1988,
determined  that  the  FMC  was  deficient,  and  a  quad-service  working  group  was  formed  to 
develop  a  revised  card.  Consensus  among  the  services  determined  that  the  revised  card 
captured  all  of  the  information  needed  at  the  first  and  second  echelons  of  medical 
treatment, and that its format was an improvement over the original card. Field tests,
however, revealed that the amount and accuracy of the information obtained with the
revised card were significantly lower than those obtained with the original card (Wilcox
& Pugh, 1990). This result underscored the need to evaluate empirically any
modifications  to  medical  data  collection  procedures. 

Many  of  the  deficiencies  found  in  both  the  old  and  new  field  medical  cards  are 
inherent  in  any  manual  documentation  procedure.  For  example,  pens  and  pencils  may 
become  lost,  reading  and  writing  on  paper  at  night  is  difficult  or  impossible,  and 
handwriting  is  often  illegible.  As  a  result,  the  feasibility  of  using  automated  methods  was 
studied. These studies led to the development of the MEDTAG concept (Galarneau &
Wilcox, 1993a). The MEDTAG, an electronic, hand-held, two-button device, was
designed  so  that  once  activated,  injury,  treatment,  patient  conditions,  and  disposition  data 
could  be  captured.  An  internal  clock  is  used  to  time-date  stamp  all  entries.  Field  tests 
using  a  proof-of-concept  device  demonstrated  not  only  that  the  MEDTAG  provided  more 
accurate  and  complete  information  but  that  it  also  captured  the  data  more  quickly 
(Galarneau & Wilcox, 1993b).

To explore the potential of the MEDTAG concept, the Office of the Assistant Secretary of
Defense (OASD, Health Affairs) requested that the original MEDTAG design be modified
so  that  it  would  interface  with  the  Multi-technology  Automated  Reader  Card  (MARC). 
The  MARC,  a  "smart  card,"  can  store  information  such  as  name,  social  security  number 
(SSN),  blood  type,  and  other  critical  information  that  could  prove  useful  to  emergency 
medical  personnel.  Thus  the  revised  MEDTAG  has  the  capability  of  recording  treatment 
information in battlefield situations, such as the type of injury and the type and time of
administered medications, and the MARC functions as a field data carrier that communicates
medical  information  between  facilities  at  the  forward  echelons  of  care.  The  results  of 
MEDTAG/MARC  field  evaluations  conducted  by  the  25th  Infantry  Division  on  Oahu, 
Hawaii,  indicated  that  the  concept  may  be  capable  of  providing  significant  improvements 
to  the  battlefield  medical  treatment  and  documentation  process  (SRA,  1995). 

Recent  interest  in  new  methods  of  data  capture,  such  as  speech  recognition 
systems  and  biosensor  technology,  is  motivated  by  the  need  to  gather  information  quickly 
and  accurately  to  maximize  the  time  available  for  clinical  care.  Although  the  suggested 
modifications  may  meet  these  goals,  research  is  needed  to  determine  if  the  changes 
required  for  the  MEDTAG  concept  will  have  the  intended  effect. 

Speech  Recognition  Systems  Defined.  Schafer  (1995)  classified  speech 
recognition  systems  according  to  the  scope  of  their  capabilities.  Speaker-dependent 
systems  must  be  “trained”  to  recognize  the  speech  of  an  individual  user,  while  speaker- 
independent  systems  attempt  to  cope  with  the  variability  of  speech  among  speakers.  Some 
systems  recognize  a  large  number  of  words  or  phrases,  while  simpler  systems  may 
recognize  only  a  few  words,  such  as  the  digits  0  through  9.  Finally,  it  is  easier  to 
recognize  isolated  words  than  to  recognize  continuous  free  speech.  Thus,  a  limited- 
vocabulary,  isolated-word,  speaker-dependent  system  would  generally  be  the  simplest  to 
implement,  while  to  approach  the  capabilities  of  the  native  speaker  would  require  a  large- 
vocabulary,  continuous  free  speech,  speaker-independent  system.  The  accuracy  of  current 
speech  recognition  systems  depends  on  the  complexity  of  the  operating  conditions. 
Recognition error rates below 1% have been obtained for highly constrained vocabulary
and controlled speaking conditions; but for large vocabulary, continuous free-speech
systems, the word error rate may exceed 25%.

Speech  Recognition  Applications.  The  largest  ongoing  commercial  application 
of  speech  recognition  systems  is  the  automation  of  telephone  operator  services.  The 
vocabularies  have  been  expanded  from  “Yes”  and  “No”  responses  to  include  selection  of 
paying  choice  (e.g.,  collect,  bill  to  third  party)  and  help  commands,  such  as  “operator.” 
Other  speech  recognition  applications  include  the  automation  of  the  operation  of  billing 
functions, cellular voice dialers for autos (which promise the ultimate in safety, namely
“eyes-free”  and  “hands-free”  communication),  voice  routing  of  calls,  automatic  creation  of 
medical  reports,  order  entry  (catalogue  sales),  forms  entry  (insurance,  medical),  and  voice 
dictation  (Rabiner,  1995).  Despite  the  advance  in  technology,  human  factors  research 
since  the  1970s  has  provided  no  conclusive  evidence  that  automatic  speech  recognition  is 
superior  to  conventional  input  devices  such  as  the  keyboard  and  mouse,  except  in 
situations  in  which  speech  input  is  the  only  alternative  (e.g.,  environments  in  which  hands 
and  eyes  are  busy). 

Weinstein (1995) described several military and government applications of
human-machine  communication  by  voice.  They  include  (1)  speech  recognition  and 
synthesis  for  mobile  command  and  control;  (2)  speech  processing  for  a  portable 
multifunction  soldier’s  computer;  (3)  speech-  and  language-based  technology  for  naval 
combat  team  tactical  training;  (4)  speech  technology  for  command  and  control  on  a  carrier 
flight  deck;  (5)  control  of  auxiliary  systems,  and  alert  and  warning  generation  in  fighter 
aircraft  and  helicopters;  and  (6)  voice  check-in,  report  entry,  and  communication  for  law 
enforcement agents or special forces. With respect to technological needs, military
applications often place higher demand on robustness to acoustic noise and user stress
than do civilian applications. However, military applications often can be carried out in
constrained  task  domains,  where,  for  example,  the  vocabulary  and  grammar  for  speech 
recognition  can  be  limited. 

Overview  of  Speech  Recognition  Literature.  Cohen  and  Oviatt  (1995)  identified 
several  situations  in  which  spoken  communication  with  machines  may  be  advantageous, 
such  as  when: 

•  the  user’s  hands  or  eyes  are  busy 

•  only  a  limited  keyboard  and/or  screen  is  available 

•  pronunciation  is  the  subject  matter  of  computer  use  (translators) 

•  natural  language  interaction  is  preferred 

They  suggested  that  in  many  applications  for  which  the  user’s  input  can  be  sufficiently 
constrained  to  permit  high  recognition  accuracy,  voice  input  leads  to  faster  task 
performance  and  fewer  errors  than  keyboard  entry.  Further,  discrete  word  recognition 
applications have been successful when at least one of the following conditions exists:
the speaker’s hands are busy, mobility is required, the speaker’s eyes are occupied, or harsh or
cramped conditions preclude use of a keyboard.

Leggett  and  Williams  (1984)  conducted  an  experiment  to  assess  the  performance 
of  speech  input  relative  to  keyboard  input  for  computer  program  entry  and  editing  tasks. 
The  results  showed  that  the  subjects  were  able  to  complete  more  of  the  input  and  edit 
tasks  by  keyboard  than  by  voice,  but  that  keyboard  input  had  a  higher  error  rate  than  did 
voice  input.  These  results  should  be  interpreted  with  the  understanding  that  the  subjects 
were  novices  with  respect  to  voice  input.  Martin  (1989)  compared  speech  with  typed, 
full-word  input,  single  key  presses,  and  mouse  clicks.  Results  supported  the  benefits  of 
speech  input  over  typed,  full-word  commands,  and  to  a  lesser  extent,  single  key  presses. 
Cochran,  Riley,  and  Stuart  (1980)  showed  speech  input  was  slower  but  more  accurate 
than typed input for entering interconnections in a circuit layout. Haller, Mutschler, and
Voss  (1984)  performed  a  study  showing  that  voice  input  was  slower  and  less  accurate 
than  keyboard  input  for  positioning  the  cursor  and  correcting  typing  errors.  Visick, 
Johnson,  and  Long  (1984)  compared  speech  and  keyed  input  devices  for  entering  the 
destinations  in  a  parcel  sorting  task.  When  users’  hands  were  busy  at  the  sorting  task, 
voice  input  was  37%  faster  but  was  less  accurate  than  typed  input.  On  the  other  hand, 
Nye  (1982)  reported  that  voice  input  could  dramatically  reduce  error  rates  in  airline 
baggage  sorting  tasks.  Voice  input  of  baggage  destinations  was  performed  with  an  error 
rate  of  1%  as  opposed  to  an  error  rate  of  10%  to  40%  for  keyed  input. 

DeHaemer,  Wright,  and  Dillon  (1994)  reported  that  data  input  by  keyboard  was 
significantly  faster  than  input  by  speech  for  a  spreadsheet  task.  However,  for  accuracy, 
efficiency,  and  user  confidence,  no  significant  difference  was  evident  between  keyboard 
and  voice.  They  suggested  that  for  most  computer  users  and  spreadsheet  users,  the  best 
interface  ultimately  will  be  multimodal. 

Taken  as  a  whole,  these  studies  are  inconclusive.  The  quality  (and  cost)  of  the 
speech  recognition  technology  used  may  be  one  factor  responsible  for  the  variable  results. 
However, the advantages of voice input over keyboard input are clear for command
activation (Poock, 1982), in applications for the handicapped (Damper, 1984), and in
certain “hands busy/eyes busy” applications, such as package sorting and computer-aided
drafting. Speech is probably more efficient than typing in tasks involving short
transactions  and  high  interaction  with  the  computer,  and  less  efficient  for  tasks  that  require 
thinking  time  or  long  interactions  (Chapanis,  1975;  Martin,  1989).  Studies  performed 
more  recently  have  focused  on  the  utility  of  voice  input  as  an  additional  input  channel  in 
multimodal  interfaces.  In  addition  to  the  successful  application  of  voice  to  “hands  busy” 
and  “eyes  busy”  tasks,  psychological  research  supports  the  view  that  people  are  more 
efficient  in  performing  multiple  tasks  distributed  across  multiple  response  channels  of 
differing modes (e.g., vocal and motor), since interference of tasks in the same modality
decreases  efficiency  (Karl,  Pettey,  &  Shneiderman,  1993;  Wickens,  Sandry,  &  Vidulich, 
1983). 

Problem 

During  combat,  documentation  of  medical  treatment  information  is  critical  for 
establishing  and  maintaining  continuity  of  patient  care.  However,  knowledge  of  the  prior 
status  and  treatment  of  patients  is  limited  to  the  information  noted  on  a  paper  FMC. 
Although  the  MEDTAG’s  two-button  data  entry  method  has  been  shown  to  document 
more  information  more  quickly  than  the  FMC,  there  is  considerable  interest  in  voice  data 
entry  because  a  corpsman’s  eyes  and  hands  are  busy  delivering  medical  care.  Further,  it 
is  hoped  that  this  “multitasking”  will  maximize  the  time  available  for  casualty  care. 
While  the  technology  for  recognizing  natural  speech  is  advancing  rapidly,  a  gap  remains 
between  normal  human  speaking  and  the  speech  a  computer  is  able  to  recognize. 
Nevertheless, speech recognition technology has reached a level where, if applications are
chosen appropriately, it can provide a means of communication between computers
and humans that, although not error-free, is approaching the acceptable range. Research
suggests  that  care  must  be  taken  when  evaluating  the  utility  of  human-machine  voice 
communication  for  new  applications.  However,  medical  documentation  on  the  battlefield 
appears  to  be  a  task  that  may  take  advantage  of  voice  communication. 

Objective 

The  objectives  of  this  study  were  to  evaluate  three  data  entry  methods  for 
documenting  casualty  care  and  to  obtain  feedback  from  subjects  regarding  the  methods. 
To  evaluate  the  speed  and  accuracy  of  medical  data  entry,  corpsmen  were  trained  and  then 
instructed to document injury, treatment, patient condition, and disposition data for two
hypothetical  patients.  Based  on  previous  findings  it  was  expected  that  the  subjects  would 
complete  each  scenario  in  less  time  and  make  fewer  errors  when  using  the  voice  input  than 
when  using  the  keyboard  or  the  two-button  methods.  Further,  it  was  expected  that 
subjects  would  prefer  voice  input,  that  they  would  find  using  voice  required  less  effort, 
was  more  comfortable,  and  was  faster  than  the  two-button  or  keyboard  methods. 

Method 

Subjects.  Twenty-four  Navy  corpsmen  from  the  1st  Medical  Battalion,  Camp 
Pendleton,  participated  in  this  study.  Their  average  age  was  29  years  and  all  had 
completed  Field  Medical  Services  School  training.  All  corpsmen  were  high  school 
graduates,  with  15  of  them  having  had  some  college  experience.  Seventy  percent  of  them 
had  been  deployed  at  least  once,  and  all  reported  having  FMC  experience.  All  but  one  of 
the  corpsmen  reported  some,  or  daily,  interaction  with  computers. 

Materials. The MEDTAG software was created to allow the user to maneuver
through a series of menus presenting items in an ordered fashion. Documentation with the
current MEDTAG is achieved by pressing the “Yes” or “No” button to move
through the series of menus. The three injury scenarios used, a
fragmentation wound, a burn wound, and a gunshot wound, are shown in Appendix A.
The  scenarios  were  chosen  for  variety  and  utilization  of  the  MEDTAG  menu  items. 
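
The two-button interaction described above can be sketched in a few lines. This is only an illustration: it assumes that “No” scrolls to the next item and “Yes” selects the current one, and the function name and menu items are hypothetical rather than taken from the MEDTAG software.

```python
# Hypothetical sketch of Yes/No menu traversal; not the actual MEDTAG code.
def select_item(items, responses):
    """Return the item chosen by a sequence of 'Y'/'N' responses.

    Assumption: 'N' scrolls to the next item and 'Y' selects the current one.
    Returns None if the responses run out or scroll past the last item.
    """
    index = 0
    for response in responses:
        if index >= len(items):
            return None              # scrolled past the end of the menu
        if response == "Y":
            return items[index]
        if response != "N":
            raise ValueError(f"unexpected response: {response!r}")
        index += 1
    return None


INJURY_MENU = ["fragmentation wound", "burn", "gunshot wound"]  # illustrative only

intended = select_item(INJURY_MENU, ["N", "Y"])        # one "No", then "Yes"
overshoot = select_item(INJURY_MENU, ["N", "N", "Y"])  # one "No" too many
print(intended, "|", overshoot)
```

Under this assumption a single extra “No” silently selects the following item, which is why a strictly ordered menu is sensitive to over-scrolling.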

Equipment.  For  this  study  a  desktop  computer  was  configured  to  simulate 
MEDTAG operation. The software was designed to accept input from the keyboard, the
two-button MEDTAG, and voice. This MEDTAG simulator, called MEDSIM, was
used  as  the  platform  for  evaluating  the  speed  and  accuracy  of  entering  medical  treatment 
data  using  the  keyboard,  two-button  MEDTAG  model,  and  voice  (see  Figure  1). 
Although  all  three  methods  are  shown  in  this  figure,  the  participants  used  only  one  at  a 
time. The following equipment was used: a Pentium-based desktop computer; a 16-inch
color monitor with extended keyboard; a noise-canceling head-mounted microphone; an
ergonomic model of MEDTAG with functioning buttons; MARC cards; and a Datacard
Series 50 Smart Card Reader/Writer.
Figure 1. MEDSIM: The desktop computer configured to use keyboard, two-button, and
speech  data  entry  methods. 


All  three  methods  of  data  entry  were  connected  to  the  computer.  The  data  input 
methods  used  were: 

Keyboard:  Only  the  “Y”  and  “N”  keys  from  a  standard  desktop  computer 
keyboard  were  used. 

Two-button:  The  two-button  method  of  data  entry  was  accomplished  by  using  a 
small,  hand-held  MEDTAG  model,  which  fits  in  one  hand.  The  “No”  button  was 
pressed  with  the  index  finger  and  the  “Yes”  button  with  the  thumb.  Input  was 
accomplished  using  the  parallel  port. 

Speech  recognition:  A  speaker-adaptive,  discrete  word  recognition  system, 
developed with Voice Tools (Dragon Systems), was incorporated into the
MEDSIM software. A lightweight, noise-canceling microphone and headset,
Shure SM10A, was positioned approximately 1/2 inch from the corner of the
subject’s mouth. To record data, the subject had to speak clearly (“Yes”/“No”)
into  the  microphone.  As  a  part  of  training,  a  standard  registration  process  for 
the  user’s  voice  was  completed. 
Care was taken to make the tasks for each type of input as operationally identical
as  possible.  The  systems  provided  auditory  feedback  in  the  form  of  a  tone  immediately 
after  each  key  press,  two-button  press,  or  utterance  was  recognized. 

Procedure.  Each  subject  completed  a  demographic  questionnaire.  Training  and 
testing  took  place  in  a  quiet  office  environment.  Subjects  sat  in  front  of  the  computer 
screen  where  they  could  see  the  MEDSIM  software  being  presented.  The  subject  trained 
the  voice  recognition  system  using  the  following  words:  one,  two,  three,  four,  five,  six, 
seven,  alpha,  bravo,  charlie,  delta,  echo,  foxtrot,  golf,  yes,  no,  and  quit.  The  word  training 
process  took  about  2  min. 

The  training  and  testing  instructions  are  included  in  Appendix  B.  The 
fragmentation  wound  scenario  was  used  during  training.  The  scenario  was  explained  and 
a copy given to each subject. The subjects documented the condition and treatment
information using all three methods of data input.

The subjects then documented two test scenarios, burn and gunshot wound, using
each  of  the  three  input  methods.  The  order  of  the  three  input  methods  was  randomized. 
Measurements  of  time  to  enter  these  data  and  the  accuracy  of  the  data  entered  were 
gathered  for  each  input  method.  The  computer  software  automatically  collected  and 
recorded the time to document each scenario. System processing time and user response
times, as well as user errors and recognition errors, were collected. The computer
recorded  input  accuracy.  An  observer  recorded  recognition  errors. 

Subjects  also  completed  a  questionnaire,  shown  in  Appendix  C,  about  their 
preferences  and  experiences  with  the  three  data  input  methods. 

Results 

Documentation  Speed.  The  computer  automatically  gathered  times  taken  to 
complete the gunshot wound and burn scenarios. Scenario completion time was adjusted
for  differences  in  system  registration  times  for  the  three  methods:  13  ms  for  the  keyboard 
method,  157  ms  for  the  two-button  method,  and  504  ms  for  the  speech  input  method. 
Total  time  to  document  a  scenario  was  adjusted  using  these  registration  times  as 
constants. So, to the extent possible, an attempt was made to control for the
imperfections of today’s technology and focus on the long-term utility of speech
technology.
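
One way such an adjustment could be computed is sketched below. The registration times come from the text, but the formula is an assumption: the report does not say how the constants were applied, so the sketch simply subtracts one registration delay per input. The input count and raw time in the example are hypothetical.

```python
# Registration time per input, in milliseconds (values from the text).
REGISTRATION_MS = {"keyboard": 13, "two-button": 157, "voice": 504}


def adjusted_time_s(raw_time_s, n_inputs, method):
    """Remove system registration time from a raw scenario time.

    Assumption: one registration delay is subtracted per input; the report
    does not give the exact formula.
    """
    overhead_s = n_inputs * REGISTRATION_MS[method] / 1000.0
    return raw_time_s - overhead_s


# Hypothetical example: a raw time of 250 s and 100 inputs for every method.
for method in REGISTRATION_MS:
    print(method, round(adjusted_time_s(250.0, 100, method), 1))
```

Under this assumption the correction is largest for voice input, which has the longest registration time.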

A  repeated-measures  analysis  of  variance,  with  data  entry  method  as  the  within- 
subjects  factor,  was  performed.  The  differences  in  the  times  required  by  the  three  data 
entry methods to completely document the combat injury scenarios were statistically
significant, F(2, 46) = 4.29, p < .02. Simple effects tests revealed that the two-button method
was significantly faster than the voice data entry method, F(1, 46) = 8.53, p < .01, while no
significant  differences  were  found  between  the  two-button  and  keyboard,  or  the  keyboard 
and  voice  input  modes. 
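
For readers unfamiliar with the analysis, the following is a minimal sketch of a one-way repeated-measures analysis of variance. The raw data are not given in the report, so the times below are made-up values used only to show the computation; with 24 subjects and three methods the degrees of freedom come out as 2 and 46. The function name is illustrative.

```python
def rm_anova(scores):
    """One-way repeated-measures ANOVA.

    scores: list of rows, one row per subject, one value per condition.
    Returns (F, df_effect, df_error).
    """
    n = len(scores)          # number of subjects
    k = len(scores[0])       # number of conditions (entry methods)
    grand = sum(sum(row) for row in scores) / (n * k)
    cond_means = [sum(row[j] for row in scores) / n for j in range(k)]
    subj_means = [sum(row) / k for row in scores]

    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
    ss_subj = k * sum((m - grand) ** 2 for m in subj_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_cond - ss_subj   # condition-by-subject residual

    df_effect = k - 1
    df_error = (k - 1) * (n - 1)
    f_stat = (ss_cond / df_effect) / (ss_error / df_error)
    return f_stat, df_effect, df_error


# Hypothetical times in seconds (keyboard, two-button, voice) for 24 subjects.
# These are made-up values, not the study's raw data.
times = [
    [350 + 5 * i + 15 + (i % 3), 350 + 5 * i + (i % 5), 350 + 5 * i + 34 + (i % 4)]
    for i in range(24)
]
f_stat, df_effect, df_error = rm_anova(times)
print(f"F({df_effect}, {df_error}) = {f_stat:.2f}")
```

Because each subject used all three methods, between-subject variability is removed from the error term, which is the main reason for using a repeated-measures design here.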

Table  1  presents  the  means  and  standard  deviations  for  the  documentation  time 
for  the  three  input  methods.  The  average  corrected  times  taken  to  complete 
documentation of the two test scenarios were 6 min 10 s using the two-button method, 6
min 24 s using the keyboard, and 6 min 44 s using the voice data input method.
Documentation  time  using  the  two-button  method  was  34  s  faster  than  for  voice  data 
entry. 

Table 1

Means and Standard Deviations for Documentation Time
Using the Three Entry Methods

                       Keyboard    Two-Button    Voice
Mean (in seconds)       384.73       369.84      403.69
Standard Deviation       56.42        65.01       70.63
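
The size of the two-button advantage over voice can be checked directly from the Table 1 means. The short calculation below assumes the percentage is expressed relative to the voice time, which is the reading that gives roughly 8%; the variable names are illustrative.

```python
# Mean documentation times in seconds, taken from Table 1.
MEAN_SECONDS = {"keyboard": 384.73, "two-button": 369.84, "voice": 403.69}

# Absolute difference between voice and two-button entry.
diff_s = MEAN_SECONDS["voice"] - MEAN_SECONDS["two-button"]

# Assumption: the percentage is taken relative to the voice time.
pct_faster = 100.0 * diff_s / MEAN_SECONDS["voice"]

print(f"difference: {diff_s:.2f} s, relative to voice: {pct_faster:.1f}%")
```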

Documentation  Accuracy.  Three  kinds  of  errors  denoting  inaccuracy  were 
monitored:  content  errors,  universal  errors,  and  method-specific  errors. 

Content  Errors.  Documentation  that  correctly  reported  casualty  medical  data  according 
to  the  scenario  was  rated  as  accurate.  Only  data  that  were  written  to  the  MARC  were 
used  in  this  analysis.  Documentation  that  was  either  inaccurately  recorded  or  completely 
missing  was  rated  as  a  content  error.  For  example,  if  the  scenario  required  the  user  to 
input  the  type  of  injury  as  a  penetrating  wound,  any  response  other  than  that  was 
considered inaccurate. Another content error occurred when vital signs were bypassed
and  not  entered  when  they  should  have  been. 

Universal  Errors.  Errors  resulting  from  the  software  or  hardware  common  to  all  three 
methods  were  considered  universal  errors.  For  example,  the  scroll  error  was  a  user  error 
common  to  the  three  data  entry  methods.  This  occurred  when  the  subject  bypassed  the 
intended  item  by  recording  too  many  “No”  responses.  Losing  the  place  on  the  menu  or 
becoming  confused  was  also  considered  a  universal  error.  Problems  with  the  Datacard 
reader/writer  or  the  MARC  itself  were  considered  hardware  errors  common  to  the  three 
methods. The frequencies of content and universal errors for each data entry method for
the two test scenarios are presented in Table 2. A goodness of fit test for the content
errors did not yield a significant chi-square. However, a significant chi-square was found
for the universal errors, χ² = 9.52, p < .01. Of the three types of universal errors that
could  be  made,  scroll  errors  accounted  for  77%  of  the  errors  for  the  keyboard  method, 
77%  for  the  two-button  method,  and  60%  for  the  speech  data  input  method. 

Table 2

Frequency of Content and Universal Errors by Data Entry Method

               Content Errors    Universal Errors    Total
Keyboard             28                 45             73
Two-Button           31                 34             65
Speech               22                 20             42
Total                81                 99            180
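
The goodness of fit statistics can be recomputed from the Table 2 frequencies. The sketch below assumes equal expected frequencies across the three methods; the report does not state the expected values, but this assumption reproduces the reported chi-square of 9.52 for universal errors. With 2 degrees of freedom, the critical values are 5.99 at the .05 level and 9.21 at the .01 level.

```python
def chi_square_gof(observed):
    """Chi-square goodness of fit statistic against equal expected counts."""
    expected = sum(observed) / len(observed)
    return sum((o - expected) ** 2 / expected for o in observed)


# Error counts for keyboard, two-button, and speech, from Table 2.
content_errors = [28, 31, 22]
universal_errors = [45, 34, 20]

chi_content = chi_square_gof(content_errors)      # below 5.99, not significant
chi_universal = chi_square_gof(universal_errors)  # above 9.21, p < .01

print(f"content: {chi_content:.2f}, universal: {chi_universal:.2f}")
```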

Although  the  difference  in  the  number  of  content  errors  made  using  the  three 
methods  was  not  significant,  the  impact  of  the  errors  should  be  examined  further.  An 
incorrect response could influence menu navigation in one of three ways: (1) it could affect
that one screen only, (2) it could require the subject to respond to additional screens, or (3)
it could allow the subject to skip several subsequent screens. Content errors that add or
delete screens could increase or decrease the time required to document the injury; it is
very difficult, however, to determine the exact amount of time these errors represent.
Method-Specific Errors. Errors occurring as a result of the individual data entry method
are  considered  here.  There  were  no  keyboard  errors  and  only  four  errors  were  due  to  the 
two-button  method.  There  were  29  errors  attributed  to  the  speech  recognition 
technology. These included speaking while the microphone was off and speaking a word that the
system did not recognize. Of the 5,928 utterances required by the task, the voice
recognition system failed to recognize only eight. This accounts for 0.135% of the utterances used during
the experiment. This type of error requires the user to restate the word. On the other
hand, only one misrecognition error occurred during the entire study. This type of error results in
the unintended recording of information.
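
The two recognition error rates follow from the counts given above. This is a simple check of the arithmetic; the variable and function names are illustrative.

```python
UTTERANCES = 5928      # utterances required by the task (from the text)
NOT_RECOGNIZED = 8     # utterances the system failed to recognize
MISRECOGNIZED = 1      # utterances recognized as the wrong word


def percent(count, total):
    """Express count as a percentage of total."""
    return 100.0 * count / total


rejection_pct = percent(NOT_RECOGNIZED, UTTERANCES)
misrecognition_pct = percent(MISRECOGNIZED, UTTERANCES)

print(f"not recognized: {rejection_pct:.3f}%, misrecognized: {misrecognition_pct:.3f}%")
```

The distinction matters operationally: a rejected utterance only costs a repetition, while a misrecognition writes unintended information to the record.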

Perceptions  of  the  Data  Entry  Methods.  The  subjects  were  asked  to  indicate 
their  level  of  agreement  with  statements  regarding  each  data  entry  method.  For  each  data 
input  method  they  were  asked  (1)  how  easy  it  was  to  learn,  (2)  whether  it  was  easy  to 
make  mistakes  using  that  method,  (3)  whether  data  entry  with  that  method  was  faster  than 
with  the  FMC,  and  (4)  whether  the  method  was  uncomfortable  or  awkward.  Means  and 
standard  deviations  are  shown  in  Table  3. 

Table 3

Means and Standard Deviations for the Attitude Items Regarding the Data Entry Methods

Item                                             Keyboard   Two-Button   Speech

1. Method is easy to learn.               Mean     1.41        1.62       1.41
                                          SD        .58         .87        .58
2. Easy to input data using this method.  Mean     1.41        1.75       1.70
                                          SD        .58         .94        .85
3. Method takes less time than FMC.       Mean     1.75        2.00       2.15
                                          SD        .73        1.10       1.03
4. Easy to make mistakes with method.     Mean     3.41        2.86       3.00
                                          SD       1.05         .96       1.14
5. Takes longer to use than FMC.          Mean     3.75        3.71       3.41
                                          SD       1.03        1.04       1.28
6. Method is uncomfortable/awkward.       Mean     3.83        3.54       3.54
                                          SD        .86         .93        .83

Note. A 5-point scale was used. 1 = Strongly Agree; 2 = Agree; 3 = Neutral; 4 = Disagree;
5 = Strongly Disagree.

In general, the corpsmen reported that keyboard and voice were easy to learn and
to use for inputting data, that the keyboard and the two-button methods took less time
compared with the FMC, and that it was easy to make mistakes using the two-button
method. In addition, the corpsmen reported that they felt comfortable speaking to the
computer and that the microphone was comfortable. The corpsmen also reported that
they thought paper copies of medical records were very important, and that they were
satisfied with the “Yes/No” format of the input screens.

Table  4  presents  the  responses  to  the  forced-choice  items. 

Table 4

Responses to the Forced-Choice Attitude Items

Item                                                    FMC      Keyboard  Two-Button  Speech
                                                        n (%)    n (%)     n (%)       n (%)

 1. Overall, which do you like best?                     0 (0)    8 (33)   11 (46)      5 (21)
 2. Overall, which do you like least?                    9 (37)   3 (12)    4 (17)      8 (33)
 3. Which do you think is the fastest?                   0 (0)   11 (46)    8 (33)      4 (17)
 4. Which do you think is the slowest?                  10 (42)   0 (0)     3 (12)     11 (46)
 5. Which do you think collects the most
    accurate information?                                0 (0)   13 (54)    6 (25)      4 (17)
 6. With which do you think you make the
    most mistakes?                                       5 (21)   7 (29)    3 (12)      7 (29)
 7. Which would you prefer to use?                       0 (0)    5 (21)   11 (46)      6 (25)
 8. Which would work best in combat?                     0 (0)    0 (0)    19 (79)      4 (17)
 9. Which would improve field medical care
    the most?                                            0 (0)    1 (4)    14 (58)      8 (33)
10. Which would be the least intrusive?                  2 (8)    3 (12)    9 (37)      7 (29)
11. Which would give your hands the most
    freedom?                                             1 (4)    0 (0)     1 (4)      20 (83)
12. Which would interfere least with your
    duties?                                              0 (0)    0 (0)     4 (17)     17 (71)

When asked which method they liked best, almost half (46%) of the corpsmen
chose the two-button method, and when asked which they liked least, 70% chose either the
FMC or the voice input. The corpsmen reported that the keyboard and the two-button
methods were the fastest, and that the FMC and voice were the slowest. They also reported
that they thought the most accurate information was collected using the keyboard method.
When asked which method they would prefer to use, which would work best in combat,
and which would most improve field medical care, the two-button method was chosen
most often. Further, the voice input method was most frequently chosen when the corpsmen
were asked which method would give them the most hand freedom and which would interfere
least with their duties.
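
The statements in this paragraph can be traced back to the Table 4 counts. The sketch below is illustrative only: the counts are transcribed from Table 4, the short item labels and helper names are assumptions, and the respondent total of 24 is inferred from the rows whose counts sum to 24 rather than stated in this section.

```python
METHODS = ("FMC", "Keyboard", "Two-Button", "Speech")

# Response counts transcribed from Table 4, in METHODS order.
# The short item labels are paraphrases, not the original wording.
COUNTS = {
    "like best": (0, 8, 11, 5),
    "like least": (9, 3, 4, 8),
    "most accurate": (0, 13, 6, 4),
    "prefer to use": (0, 5, 11, 6),
    "best in combat": (0, 0, 19, 4),
    "improve care most": (0, 1, 14, 8),
    "most hand freedom": (1, 0, 1, 20),
    "interfere least": (0, 0, 4, 17),
}

# Assumption: inferred from the rows that sum to 24 (e.g., items 1, 2, 4).
N_RESPONDENTS = 24


def top_choice(counts):
    """Return the method with the highest count (first method wins a tie)."""
    best_index = max(range(len(counts)), key=lambda i: counts[i])
    return METHODS[best_index]


def share(counts, methods, n):
    """Percentage of n respondents who chose any of the given methods."""
    chosen = sum(c for m, c in zip(METHODS, counts) if m in methods)
    return 100.0 * chosen / n


for item, counts in COUNTS.items():
    print(f"{item:20s} -> {top_choice(counts)}")

least_liked = share(COUNTS["like least"], {"FMC", "Speech"}, N_RESPONDENTS)
print(f"FMC or speech liked least: {least_liked:.1f}%")
```

Several rows in Table 4 sum to fewer than 24, which suggests some items were left unanswered; the sketch does not attempt to account for those missing responses.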

Table  5  shows  the  responses  when  the  corpsmen  were  asked  to  choose  the  best 
combination  of  methods.  The  two-button  method  was  the  most  frequently  mentioned  in 
some  combination.  Not  reflected  in  the  table  were  the  3  corpsmen  who  reported  that 
keyboard,  two-button,  and  speech  would  be  the  best  combination,  and  the  one  who 
reported  that  two-button  alone  was  best. 

Table  5 

Preferred  Combination  of  Data  Entry  Methods 
FMC  Keyboard  Two-Button  Speech 

FMC  —  1 

Keyboard 
Two-Button 
Speech 

Discussion 

Designing an effective user interface for a voice application involves
consideration of (1) the information requirements of the task; (2) the limitations and
capabilities of the voice technology; and (3) the expectations, expertise, and preferences of
the user (Kamm, 1995). In general, speech recognition was found to be slower, yet
somewhat more accurate, than either the keyboard or the two-button method, and users
reported a preference for the two-button method.

MEDSIM,  the  PC-based  simulator  and  software,  performed  extremely  well  in  this 
study.  The  equipment,  card  reader,  two-button  interface,  and  speech  recognition 
technology  systems  all  performed  better  than  had  been  anticipated.  The  speech 
recognition  rate  was  extremely  high  (99.99%),  partly  as  a  function  of  the  highly 
constrained  vocabulary  and  controlled  speaking  conditions.  Although  the  vocabulary  for 
the  speech  recognition  was  limited,  the  system  was  robust  enough  to  handle  external 
environmental  noise  (e.g.,  lawnmowers,  other  voices). 

The  finding  that  the  two-button  and  keyboard  data  input  methods  were  faster 
than  the  speech  method  could  be  attributed  in  part  to  the  novelty  of  the  speech  recognition 
system.  Although  training  using  the  voice  method  was  provided,  the  technology  was  new 
to the users. The 34 s advantage of the keyboard method may not appear significant;
however, it could be critical in the context of combat casualty care. The speech
recognition method did, however, produce fewer errors overall. The novelty of the
method may not only have slowed the users down but also made them more
cautious. This could explain the significantly fewer scroll errors made using the speech
technology  compared  with  both  the  keyboard  and  two-button  methods.  In  general,  the 
corpsmen  preferred  the  two-button  method  over  the  keyboard  and  voice  input  methods. 
However,  the  participants  expressed  the  desirability  of  having  speech  input  as  an  option 
for  medical  documentation. 

These findings are partially consistent with DeHaemer et al. (1994), who found
that data input by keyboard was significantly faster than by speech. For accuracy,
keystroke efficiency, and user confidence, they found no significant differences between
keyboard and voice input. They suggested the best interface will ultimately be
multimodal. This would take advantage of the combination of voice, keyboard, and
touchscreen input to suit the task and the user. In addition to the successful application of
voice to “hands busy” and “eyes busy” tasks, psychological research supports the view
that people are more efficient in performing multiple tasks distributed across multiple
response channels of differing modalities (e.g., vocal and motor), since interference of
multiple tasks performed in the same modality decreases efficiency (Chapanis, 1975;
Martin, 1989; Wickens, Sandry, & Vidulich, 1983). Building good speech recognition
systems should enhance user-computer interaction because, in multi-task situations, they
provide an additional response channel over which the workload can be spread.

This  study  also  suggests  that  speech  input  capabilities  may  change  the  nature  of 
what  information  users  enter  into  the  system.  Although  use  of  speech  input  capability  was 
not  successful  in  this  study,  it  may  become  more  useful  when  users  are  allowed  to  enter 
the  name  of  the  injury,  the  type  of  treatment,  and  the  condition  of  the  patient.  Modifying 
the  task  to  more  closely  resemble  the  job  performed  has  been  shown  in  other  studies  to 
improve  the  speed  and  accuracy  of  the  voice  input  method  (Kamm,  1995). 

A  gap  exists  between  performance  in  the  lab  and  performance  in  the  field, 
however.  Problems  not  encountered  in  the  lab  but  very  likely  to  be  found  in  the  field 
might  include  variation  in  speaking  style,  noise,  ambiguity  of  language,  or  confusion  on 
the  part  of  the  speaker.  Therefore,  applied  evaluations  of  speech  input  devices  are  an 
important  source  of  information.  They  may  support  the  claim  that  one  value  of  speech 
input  devices  lies  in  freeing  users  from  the  keyboard,  enabling  them  to  use  their  hands  for 
one aspect of a task and their voices for another. With respect to technological needs,
military applications often place higher demands on robustness in the presence of acoustic
noise and user stress than do civilian applications. Military applications, however, can
often be carried out in constrained task domains where, for example, the vocabulary and
grammar for speech recognition can be limited.

The  voice  communication  interface,  which  is  critical  to  user  acceptance  of  voice 
processing  technology,  also  needs  to  be  examined.  Interface  issues  that  need  to  be 
considered  include  vocabulary  size  and  content,  continuous  speech  versus  isolated  words, 
constraints  on  grammar  and  speaking  style,  the  need  for  training  of  the  recognition  system, 
the  quality  and  naturalness  of  synthetic  voice  response,  the  way  the  system  handles  its 
errors  in  speech  understanding,  and  the  availability  and  convenience  of  alternate 
communication modalities. Voice in combination with full screen/keyboard entry would
allow the user to jump to specific sections of the software. This could be beneficial for
time-constrained situations when only essential information can be gathered and
transferred. Future studies will explore the impact of vocabulary size and content, as well
as the effect of alternate communication modalities on the speed and accuracy of medical
documentation.

Appendix A

Injury Scenarios

[Scenario card: MEDTAG Scenario #1 (FRAG). Fields: INJURY, TREATMENT. Legible entries:
FRAG - LEFT LEG; RIGHT KNEE; BATTLE DRESS]

4. Voluntary Disclosure. Provision of information is voluntary. Failure to provide the
requested information may result in failure to be accepted as a research volunteer in an
experiment, or in removal from the program.